# Litellm as backend

Lighteval allows to use litellm, a backend allowing you to call all LLM APIs
using the OpenAI format [Bedrock, Huggingface, VertexAI, TogetherAI, Azure,
OpenAI, Groq etc.].

Documentation for available APIs and compatible endpoints can be found [here](https://docs.litellm.ai/docs/).

## Quick use

```bash
lighteval endpoint litellm \
    "gpt-3.5-turbo" \
    "lighteval|gsm8k|0|0"
```

## Using a config file

Litellm allows generation with any OpenAI compatible endpoint, for example you
can evaluate a model running on a local vllm server.

To do so you will need to use a config file like so:

```yaml
model:
  base_params:
    model_name: "openai/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
    base_url: "URL OF THE ENDPOINT YOU WANT TO USE"
    api_key: "" # remove or keep empty as needed
  generation:
    temperature: 0.5
    max_new_tokens: 256
    stop_tokens: [""]
    top_p: 0.9
    seed: 0
    repetition_penalty: 1.0
    frequency_penalty: 0.0
```

## Use Hugging Face Inference Providers

With this you can also access HuggingFace Inference servers, let's look at how to evaluate DeepSeek-R1-Distill-Qwen-32B.

First, let's look at how to acess the model, we can find this from [the model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B).

Step 1:

![Step 1](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lighteval/litellm-guide-2.png)

Step 2:

![Step 2](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/lighteval/litellm-guide-1.png)

Great ! Now we can simply copy paste the base_url and our api key to eval our model.

> [!WARNING]
> Do not forget to prepend the provider in the `model_name`. Here we use an
> openai compatible endpoint to the provider is `openai`.

```yaml
model:
  base_params:
    model_name: "openai/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
    base_url: "https://router.huggingface.co/hf-inference/v1"
    api_key: "YOUR KEY" # remove or keep empty as needed
  generation:
    temperature: 0.5
    max_new_tokens: 256 # This will overide the default from the tasks config
    top_p: 0.9
    seed: 0
    repetition_penalty: 1.0
    frequency_penalty: 0.0
```

And then, we are able to eval our model on any eval available in Lighteval.

```bash
lighteval endpoint litellm \
    "examples/model_configs/litellm_model.yaml" \
    "lighteval|gsm8k|0|0"
```
